Abstract Networks of neurons in the brain encode 
memories via their synaptic connections. Despite re- 
ceiving considerable attention, the precise relationship 
between network connectivity and encoded activity pat- 
terns is still poorly understood. In particular, given a 
prescribed list of binary patterns, it is not generally 
known how to arrange the connectivity of a network so 
that exactly those patterns are encoded, while avoid- 
ing unwanted "spurious" states. Here we consider this 
problem for networks of threshold-linear neurons. We 
introduce a simple encoding rule that selectively turns 
"on" synapses between neurons that co-appear in one or 
more patterns. The synapses are binary, in the sense of 
having only two states ("on" or "off"), but also graded, 
with heterogeneous weights drawn from an underlying 
synaptic strength matrix S. Our main results provide 
necessary and sufficient conditions on S guaranteeing 
that prescribed patterns can be encoded, while main- 
taining tight control over spurious states. As an appli- 
cation, we construct networks that encode hippocampal 
place field codes nearly exactly. We suggest that, in this 
context, spurious states can be advantageous, allowing 
neural codes to be accurately encoded from a highly 
undersampled set of patterns. To obtain our results, we 
use ideas from convex and distance geometry, such as 
Helly's Theorem and Cayley-Menger determinants, re- 
vealing a novel connection between these areas of math- 
ematics and coding properties of neural networks. 



Introduction 

Networks of neurons in the brain encode memories via 
their synaptic connections. These memories are often 
modeled as binary patterns of neural activity associ- 
ated to steady state attractors of a recurrent network 
|19l l2"l[TT] . A binary pattern on n neurons is a string of 0s
and 1s, with a 1 for each active neuron and a 0 denoting
silence; equivalently, it is a subset of (active) neurons
$\sigma \subset \{1, \ldots, n\}$. Given a prescribed set of binary
patterns, how can one arrange the connectivity structure
of a recurrent network such that precisely those pat- 
terns are encoded, while minimizing the emergence of 
unwanted "spurious" states? This problem, which we 
refer to as the Network Encoding (NE) Problem, dates 
back at least to 1982 and has been most commonly 
studied in the context of the Hopfield model [TO1I21H7].
Consider a network on n neurons that is characterized
by a real-valued $n \times n$ matrix $W$, where $W_{ij}$ is
the connection strength from the jth to the ith neuron.
To each neuron we associate an activity variable, $x_i(t)$,
that evolves in time according to a prescription for the
network dynamics. An encoded pattern of the network,

$\sigma \subset [n] = \{1, \ldots, n\}$,

is a binary pattern that can be activated. This means
there exists an external input to the network such that
$x(t) = (x_1(t), \ldots, x_n(t))$ converges to a steady state $x^*$
(a stable fixed point) with support $\sigma$:

$\sigma = \mathrm{supp}(x^*) = \{i \in [n] \mid x_i^* > 0\}$.

For a given choice of network dynamics, the matrix $W$
determines the set of encoded patterns of the network;
we call this set the code of the network, and denote it
$C(W) \subset 2^{[n]}$, where $2^{[n]}$ is the set of all binary patterns
(i.e., all subsets of $[n]$).

NE Problem. Given a prescribed set of binary patterns,
$V \subset 2^{[n]}$, find a network $W$ such that $V \subset C(W)$,
while minimizing the number of unwanted spurious
states, which are the elements of $C(W) \setminus V$.

We say that a network W is an exact solution to the 
NE problem for V if there are no spurious states, i.e. if 
C(W) = V . Under what conditions are exact solutions 
possible? If we do not have an exact solution, what are 
the spurious states and how can we control them? Is 
there a biologically plausible encoding rule that can be 
used to construct W from V?

We take a new look at the NE problem using networks
of threshold-linear neurons. To find solutions, we
investigate a simple encoding rule that operates on an
inhibitory network and selectively switches "on" excitatory
synapses between neurons that co-appear in one
or more patterns. A key feature of this rule is the use
of binary graded synapses. That is, we assume the excitatory
synaptic connections between pairs of neurons
are not only binary, in the sense that each synapse has
only two states ("on" or "off"), but also graded, because
connection strengths may vary from one synapse to another.
The strengths of "on" synapses are considered to
be predetermined by the underlying architecture of the
network, and are given by a synaptic strength matrix S.
There is, in fact, experimental evidence for hippocampal
synapses that appear binary in this sense [29], with
individual synapses exhibiting potentiation in an all-or-nothing
fashion, but having different "thresholds" for
potentiation and heterogeneous synaptic strengths.

Although the NE problem has typically been stud- 
ied assuming uncorrelated (near-orthogonal) neural ac- 
tivity patterns, we make no such assumptions on V . In 
fact, a central motivation for our present work stems 
from the problem of encoding heavily overlapping pat- 
terns corresponding to neural codes in cortical and hip- 
pocampal areas. A simple but important example is the 
case of place field codes (PF codes) in the hippocampus, 
where single neuron activity is characterized by place 
fields [26, 27]. Because place fields overlap, the activity 
patterns comprising a PF code are highly overlapping. 

Our main results, Theorems 2 and 3, precisely characterize
the codes C(W) that are obtained using our 
encoding rule and, more generally, the sets V of binary 
patterns that admit exact solutions to the NE prob- 
lem via symmetric threshold-linear networks. When V 
is not encoded exactly, we are able to describe the spu- 
rious states, and find that they correspond to cliques 
in the "co-firing" graph of V . These results imply that 
when V is a one-dimensional PF code, our encoding rule 
naturally yields exact solutions to the NE problem for 
V . In the case of two-dimensional PF codes, we generi- 
cally obtain near-exact solutions, as there are very few 
spurious states. Moreover, after applying our encoding 
rule to a random subsampling of patterns, the spuri- 
ous states that arise are typically elements of the full 
PF code. We suggest that - in this context - spurious 
states can be advantageous, allowing PF codes to be 
efficiently encoded from a highly undersampled set of 
patterns. 

Our results use ideas from classical distance and 
convex geometry, such as Cayley-Menger determinants 
[5] and Helly's theorem [5], establishing a novel connection
between these areas of mathematics and neural 
network theory. 



Background 

Threshold-linear networks. A threshold-linear network
is a firing rate model for a recurrent network [13, 15, 16]
where the neurons all have threshold nonlinearity,
$\varphi(y) = [y]_+ = \max\{y, 0\}$. The dynamics are given by

$$\frac{dx_i}{dt} = -\frac{1}{\tau_i} x_i + \varphi\Big(\sum_{j=1}^{n} W_{ij} x_j + e_i - \theta_i\Big), \quad i = 1, \ldots, n,$$

where n is the number of neurons, $x_i(t)$ is the firing
rate of the ith neuron at time t, $e_i$ is the external input
to the ith neuron, and $\theta_i > 0$ is its threshold. $W_{ij}$
denotes the effective strength of the recurrent connection
from the jth to the ith neuron, and the timescale
$\tau_i > 0$ gives the rate at which a neuron's activity decays
to zero in the absence of any inputs. Although
sigmoids more closely match experimentally measured
input-output curves for neurons, the above threshold
nonlinearity is often a good approximation when neurons
are far from saturation [13, 31]. Assuming that encoded
patterns of a network are in fact realized by neurons
that are firing far from saturation, it is reasonable
to approximate them as stable fixed points of the
threshold-linear dynamics.

We can express the dynamics more compactly as 

$$\dot{x} = -Dx + [Wx + b]_+, \qquad (1)$$

where $D = \mathrm{diag}(1/\tau_1, \ldots, 1/\tau_n)$ is the diagonal matrix
of inverse time constants, $b = (b_1, \ldots, b_n) \in \mathbb{R}^n$ with
$b_i = e_i - \theta_i$, and $[\cdot]_+$ is applied elementwise. Note that,
unlike in the Hopfield model, the "input" to the network
comes in the form of a constant external drive, b, rather
than an initial condition x(0).
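
As a minimal numerical sketch of equation (1), assuming D = I, the dynamics can be integrated with a forward-Euler step. The function names `simulate` and `support`, the step size, and the two-neuron example are illustrative assumptions and do not come from the original text.

```python
import numpy as np

def simulate(W, b, x0=None, dt=0.01, steps=20000):
    """Forward-Euler integration of dx/dt = -x + [W x + b]_+ (assumes D = I)."""
    W = np.asarray(W, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.zeros(len(b)) if x0 is None else np.array(x0, dtype=float)
    for _ in range(steps):
        # [.]_+ is applied elementwise, as in equation (1)
        x = x + dt * (-x + np.maximum(W @ x + b, 0.0))
    return x

def support(x, tol=1e-6):
    """Indices of neurons whose steady-state firing rate is positive."""
    return {i for i, xi in enumerate(x) if xi > tol}

# Hypothetical two-neuron network with mutual inhibition and constant drive b.
W = np.array([[0.0, -0.5],
              [-0.5, 0.0]])
b = np.array([1.0, 1.0])
x_star = simulate(W, b)      # approximate steady state
sigma = support(x_star)      # the activated binary pattern
```

In this example the input b is held constant and the initial condition is zero, which mirrors the remark above that the input is an external drive rather than an initial condition.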

The matrix D will be considered fixed, with strictly
positive diagonal. We will assume homogeneous timescales
and use D = I (the identity matrix) for the Encoding
Rule, but all results apply equally well to heterogeneous
timescales. We also assume that $-D + W$
has strictly negative diagonal, so that the activity of an
individual neuron always decays to zero in the absence
of external or recurrent inputs. Although we consider
responses to the full range of inputs $b \in \mathbb{R}^n$, the possible
steady states of (1) are sharply constrained by the
connectivity matrix W. Assuming fixed D, we refer to
a particular threshold-linear network simply as W.

Recall that the code C(W) is the set of all encoded
patterns of W, and that encoded patterns are binary
patterns that can be activated as steady states in response
to external input. For threshold-linear networks,
an encoded pattern is exactly the same as a stable set
(a.k.a. "permitted set" [16]) of the network, which is a
non-empty subset of neurons $\sigma \subset [n]$ with the property
that, for at least one external input $b \in \mathbb{R}^n$, there exists
an asymptotically stable fixed point $x^*$ such that
$\sigma = \mathrm{supp}(x^*)$ [11]. It has been previously shown that
stable sets of W correspond to stable principal submatrices
of $-D + W$ [11, Theorem 1.2] (see also [16] for a
proof specific to the symmetric case). As usual, a stable
matrix is a matrix whose eigenvalues all have strictly
negative real part. For any $n \times n$ matrix A, the notation
$A_\sigma$ denotes the principal submatrix obtained by
restricting to the index set $\sigma$; if $\sigma = \{s_1, \ldots, s_k\}$, then
$A_\sigma$ is the $k \times k$ matrix with $(A_\sigma)_{ij} = A_{s_i s_j}$. We denote
the set of all stable principal submatrices of A as

$\mathrm{stab}(A) = \{\sigma \subset [n] \mid A_\sigma \text{ is a stable matrix}\}$.

We can now state the relevant implications of the above. 

Theorem 1 Let W be a threshold-linear network on n
neurons with dynamics given by equation (1), and let
C(W) be the code of W. The following two statements
hold:

1. $C(W) = \mathrm{stab}(-D + W)$.

2. If W is symmetric, then there exists a symmetric
$n \times n$ matrix A with zero diagonal such that

$C(W) = \mathrm{stab}(-11^T + A)$,

where $-11^T$ denotes the $n \times n$ matrix of all $-1$s.

Statement 1 is a direct consequence of [11, Theorem
1.2]. Statement 2 is Lemma 6 in the Appendix.
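
Statement 1 of Theorem 1 suggests a direct, brute-force way to list the code of a small network: test every principal submatrix of $-D + W$ for stability. The sketch below assumes D = I; the helper names `is_stable` and `stab` and the three-neuron matrix are hypothetical, and the enumeration over all $2^n - 1$ nonempty subsets is only practical for small n.

```python
import itertools
import numpy as np

def is_stable(M):
    """True if every eigenvalue of M has strictly negative real part."""
    return bool(np.all(np.linalg.eigvals(M).real < 0))

def stab(A):
    """All nonempty index sets sigma whose principal submatrix A_sigma is stable."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    stable_sets = set()
    for k in range(1, n + 1):
        for sigma in itertools.combinations(range(n), k):
            if is_stable(A[np.ix_(sigma, sigma)]):
                stable_sets.add(frozenset(sigma))
    return stable_sets

# Hypothetical symmetric network on 3 neurons (zero diagonal), with D = I.
W = np.array([[ 0.0, -0.5, -2.0],
              [-0.5,  0.0, -2.0],
              [-2.0, -2.0,  0.0]])
code = stab(-np.eye(3) + W)   # C(W) = stab(-D + W)
```

Here neurons 0 and 1 inhibit each other weakly and can be co-active, while neuron 2 inhibits both strongly, so no stable set contains it together with another neuron.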

Theorem 1 allows one to find all encoded patterns
of a given network. Our primary interest, however, is in
the inverse problem: Given a set of patterns V, can we
find a network W that encodes precisely those patterns?
Theorem 1 implies that V admits an exact solution to
the NE problem if and only if there exists a W such
that $V = \mathrm{stab}(-D + W)$. From this it is easy to infer
that exact solutions do not always exist (see Corollary 3
in the Appendix). If W is not an exact solution for
V, then what are the spurious states? We tackle these
questions by analyzing the following Encoding Rule.
Although this rule yields a restricted set of networks 
W, we will see that the corresponding C(W) encompass 
all possible codes that can be generated by symmetric 
threshold-linear networks. 

Encoding Rule. The encoding rule is a prescription
for obtaining a network W from a set of binary patterns
$V \subset 2^{[n]}$.

Step 1: Fix an $n \times n$ synaptic strength matrix S and
an $\varepsilon > 0$. We think of S and $\varepsilon$ as intrinsic properties of
the underlying network architecture, established prior
to encoding. We use a symmetric encoding rule, and so
require that $S_{ij} = S_{ji} \geq 0$ and $S_{ii} = 0$.

Step 2: The network W is initialized to be symmetric
with effective connection strengths $W_{ij} = W_{ji} < -1$
for $i \neq j$, and $W_{ii} = 0$. (Beyond this requirement, the
initial values of W do not affect our results.)

Step 3: Following presentation of each pattern $\sigma \in V$,
we turn "on" all excitatory synapses between neurons
that co-appear in $\sigma$. This means we update the relevant
entries of W as follows:

$W_{ij} := -1 + \varepsilon S_{ij}$ if $i, j \in \sigma$ and $i \neq j$.

In particular, the order of presentation does not matter,
and once an excitatory connection has been turned
"on," the value of $W_{ij}$ stays the same regardless of the
remaining patterns.

Note that this rule is Hebbian and local; i.e., each
synapse is updated only in response to the co-activation
of the two adjacent neurons, and the updates can be
implemented by presenting only one pattern at a time.
To better understand what kinds of networks and
codes result from applying the Encoding Rule, observe
that any initial W in Step 2 can be written as
$W_{ij} = -1 - \varepsilon R_{ij}$, where $R_{ij} = R_{ji} > 0$ for $i \neq j$ and
$R_{ii} = -1/\varepsilon$, so that $W_{ii} = 0$. Assuming a threshold-linear
network with homogeneous timescales, i.e. fixing D = I,
the final network W obtained from V after Step 3
satisfies

$$(-D + W)_{ij} = \begin{cases} -1 + \varepsilon S_{ij}, & \text{if } (ij) \in G(V) \\ -1, & \text{if } i = j \\ -1 - \varepsilon R_{ij}, & \text{if } (ij) \notin G(V), \end{cases} \qquad (2)$$

where G(V) is the graph on n vertices (neurons) having
an edge for each pair of neurons that co-appear in
one or more patterns of V. We call this graph the
co-firing graph of V. In essence, the rule allows the network
to "learn" G(V), selecting which excitatory synapses
are turned "on" and assigned to their predetermined
weights.
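
The three steps of the Encoding Rule and the co-firing graph can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the names `cofiring_graph` and `encode`, the default choice $R_{ij} = 1$ for the initial inhibitory weights, and the example matrix S are all assumptions.

```python
import itertools
import numpy as np

def cofiring_graph(patterns):
    """Edges (i, j), i < j, for every pair of neurons that co-appear in a pattern."""
    edges = set()
    for sigma in patterns:
        for i, j in itertools.combinations(sorted(sigma), 2):
            edges.add((i, j))
    return edges

def encode(patterns, S, eps, R=None):
    """Encoding Rule: start from W_ij = -1 - eps*R_ij, then switch 'on' co-firing pairs."""
    S = np.asarray(S, dtype=float)
    n = S.shape[0]
    R = np.ones((n, n)) if R is None else np.asarray(R, dtype=float)
    W = -1.0 - eps * R                 # Step 2: W_ij < -1 off the diagonal
    np.fill_diagonal(W, 0.0)           # Step 2: W_ii = 0
    for i, j in cofiring_graph(patterns):
        W[i, j] = W[j, i] = -1.0 + eps * S[i, j]   # Step 3
    return W

# Hypothetical example: 3 neurons, two overlapping patterns.
S = np.array([[0.0, 1.0, 4.0],
              [1.0, 0.0, 1.0],
              [4.0, 1.0, 0.0]])
patterns = [{0, 1}, {1, 2}]
G = cofiring_graph(patterns)
W = encode(patterns, S, eps=0.1)
```

Because each "on" entry depends only on S and on whether the pair is an edge of the co-firing graph, the result does not depend on the order in which patterns are presented.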

Any matrix $-D + W$ obtained via this rule has the
form $-11^T + \varepsilon A$, where A is a symmetric matrix with
zero diagonal and off-diagonal entries $A_{ij} = S_{ij} \geq 0$ or
$A_{ij} = -R_{ij} < 0$, depending on V. It follows from part
1 of Theorem 1 that the code of this network is given by

$C(W) = \mathrm{stab}(-11^T + \varepsilon A)$.

Furthermore, we know by part 2 of Theorem 1 that the
code of any symmetric W is of this form. Hence, although
the Encoding Rule cannot produce all symmetric
networks W, it does yield all possible codes, C(W),
corresponding to symmetric threshold-linear networks.

What does the symmetry of W tell us about
C(W) and, more generally, the dynamics of the network?
It is easy to see that if W is symmetric, the code
$C(W) = \mathrm{stab}(-D + W)$ has the structure of a simplicial






C. Curto, A. Degeratu, V. Itskov 



complex.¹ Recall that an (abstract) simplicial complex
$\Delta \subset 2^{[n]}$ is a set of subsets of [n] such that the following
two properties hold: (1) $\{i\} \in \Delta$ for each $i \in [n]$, and
(2) if $\sigma \in \Delta$ and $\tau \subset \sigma$, then $\tau \in \Delta$. Property 1 always
holds for $\mathrm{stab}(-D + W)$, because $-D + W$ has strictly
negative diagonal. To check property 2, note that if the
matrix $-D + W$ is symmetric then Cauchy's interlacing
theorem applies (Theorem 4 in the Appendix). A consequence
of this theorem is that any principal submatrix
of a stable symmetric matrix is itself stable (see the corollary
to Theorem 4 in the Appendix), and so $\mathrm{stab}(-D + W)$ satisfies
property 2. We are not currently aware of any example of
a simplicial complex that is not realizable as the code
of a symmetric threshold-linear network, although it is
likely that such examples exist.

In addition to being symmetric, the Encoding Rule
(for small enough $\varepsilon$) generates "lateral inhibition" networks
where the matrix $-D + W$ has strictly negative
entries; in particular, $D - W$ is copositive. It follows
from [16, Theorem 1] that for all input vectors $b \in \mathbb{R}^n$
and for all initial conditions, the network dynamics
converge to a stable fixed point.



Main Results 

Our main results, Theorems 2 and 3, characterize the
codes C(W) obtained using the Encoding Rule, as well
as all sets of binary patterns V that admit exact solutions
to the NE problem via symmetric threshold-linear
networks. In particular, we find that all clique complexes
and their k-skeleta admit exact solutions, a fact
that plays an important role when we later investigate
encoding of PF codes.

Recall that the code of any symmetric network on
n neurons has the form $C(W) = \mathrm{stab}(-11^T + \varepsilon A)$, for
$\varepsilon > 0$ and A a symmetric $n \times n$ matrix with zero diagonal.²
Describing such a code requires understanding
the stability of principal submatrices that are all of the
form $-11^T + \varepsilon A_\sigma$, which motivates the question:

Given $\varepsilon > 0$ and any symmetric matrix A with zero
diagonal, when is $-11^T + \varepsilon A$ a stable matrix?

The answer to this question emerges from a surprising
connection to classical distance geometry, a field that
grew around the problem of finding conditions for a
finite set of distances to be realizable from a configuration
of points in Euclidean space [8]. In what follows,
square distance matrices will play a central role.

¹ This was first observed in [16], using a version of Theorem 1
for symmetric W.

² In fact, any code of this form can be obtained by perturbing
around any rank 1 matrix (not necessarily symmetric)
having strictly negative diagonal (Proposition 2 in the Appendix).

Definition 1 An $n \times n$ matrix A is a (Euclidean) square
distance matrix if there exists a configuration of points
$p_1, \ldots, p_n \in \mathbb{R}^{n-1}$ (not necessarily distinct) such that
$A_{ij} = \|p_i - p_j\|^2$. A is a nondegenerate square distance
matrix if the corresponding points are affinely independent;
i.e., if the convex hull of $p_1, \ldots, p_n$ is a simplex
with nonzero volume in $\mathbb{R}^{n-1}$.

A key object for determining whether or not A is a
nondegenerate square distance matrix is the Cayley-Menger
determinant, defined as

$$\mathrm{cm}(A) = \det \begin{bmatrix} 0 & 1^T \\ 1 & A \end{bmatrix},$$

where $1 \in \mathbb{R}^n$ is the column vector of all ones. It is well-known
that if A is a square distance matrix, cm(A)
is proportional to the square volume of the simplex
obtained as the convex hull of the points $\{p_i\}$ (see
Lemma 8 in the Appendix). In particular, $|\mathrm{cm}(A)| > 0$
if A is a nondegenerate square distance matrix, while
$\mathrm{cm}(A) = 0$ for any other (degenerate) square distance
matrix.
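
The Cayley-Menger determinant is straightforward to evaluate numerically by bordering A with a row and column of ones. The sketch below is illustrative; the function names and the two point configurations are assumptions chosen so that one configuration is nondegenerate (a triangle) and the other degenerate (three collinear points).

```python
import numpy as np

def square_distance_matrix(points):
    """A_ij = ||p_i - p_j||^2 for a list of points in Euclidean space."""
    P = np.asarray(points, dtype=float)
    diff = P[:, None, :] - P[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def cayley_menger(A):
    """cm(A): determinant of A bordered by a zero corner and all-ones row/column."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.ones((n + 1, n + 1))
    M[0, 0] = 0.0
    M[1:, 1:] = A
    return np.linalg.det(M)

# Nondegenerate example: vertices of a right triangle with area 1/2.
A_triangle = square_distance_matrix([[0, 0], [1, 0], [0, 1]])
cm_triangle = cayley_menger(A_triangle)

# Degenerate example: three collinear points, so the "simplex" has zero area.
A_line = square_distance_matrix([[0, 0], [1, 0], [2, 0]])
cm_line = cayley_menger(A_line)
```

For three points the determinant equals $-16$ times the squared area of the triangle, which gives $-4$ for the first configuration and $0$ for the second.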

With these notions from distance geometry, we can 
now answer the above question. 

Proposition 1 Let $\varepsilon > 0$, and let A be a symmetric
$n \times n$ matrix with zero diagonal. Then the matrix

$-11^T + \varepsilon A$

is stable if and only if the following two conditions hold:

(a) A is a nondegenerate square distance matrix, and

(b) $0 < \varepsilon < |\mathrm{cm}(A)/\det(A)|$.

Proposition 1 is a special case of Theorem 5, our core
technical result, whose statement and proof are given
in the Appendix.
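
Proposition 1 can be checked numerically on a small example. In the sketch below, A is the square distance matrix of the (hypothetical) points (0,0), (1,0), (0,1), so condition (a) holds by construction, and stability is tested on either side of the bound in condition (b). The helper names are illustrative assumptions.

```python
import numpy as np

def cayley_menger(A):
    """cm(A): determinant of A bordered by a zero corner and all-ones row/column."""
    n = A.shape[0]
    M = np.ones((n + 1, n + 1))
    M[0, 0] = 0.0
    M[1:, 1:] = A
    return np.linalg.det(M)

def perturbed(A, eps):
    """The matrix -11^T + eps*A."""
    n = A.shape[0]
    return -np.ones((n, n)) + eps * A

def is_stable(M):
    """Stability test for a symmetric matrix: all eigenvalues strictly negative."""
    return bool(np.all(np.linalg.eigvalsh(M) < 0))

# Square distances between the points (0,0), (1,0), (0,1).
A = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 2.0],
              [1.0, 2.0, 0.0]])

bound = abs(cayley_menger(A) / np.linalg.det(A))   # |cm(A)/det(A)|
stable_below = is_stable(perturbed(A, 0.5 * bound))
stable_above = is_stable(perturbed(A, 1.5 * bound))
```

For this A the bound evaluates to 1: the perturbed matrix is stable for $\varepsilon = 0.5$ and unstable for $\varepsilon = 1.5$, as condition (b) predicts.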

The ratio $|\mathrm{cm}(A)/\det(A)|$ has a simple geometric
interpretation (see Remark 1 in the Appendix). Moreover,
since $|\mathrm{cm}(A)| > 0$ whenever A is a nondegenerate
square distance matrix, there always exists an $\varepsilon$ small
enough to satisfy the second condition, provided the
first condition holds. Combining Proposition 1
with Cauchy's interlacing theorem yields:

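Lemma 1 If A is an $n \times n$ nondegenerate square distance
matrix, then

$$0 < \left|\frac{\mathrm{cm}(A_\sigma)}{\det(A_\sigma)}\right| < \left|\frac{\mathrm{cm}(A_\tau)}{\det(A_\tau)}\right| \quad \text{if } \tau \subsetneq \sigma \subseteq [n].$$

Given any symmetric $n \times n$ matrix A with zero diagonal,
and $\varepsilon > 0$, it is now natural to define two simplicial
complexes in $2^{[n]}$:

$$\mathrm{geom}_\varepsilon(A) = \left\{\sigma \subset [n] \;\middle|\; A_\sigma \text{ a nondeg. sq. dist. matrix, and } 0 < \varepsilon < \left|\frac{\mathrm{cm}(A_\sigma)}{\det(A_\sigma)}\right|\right\}, \text{ and}$$

$$\mathrm{geom}(A) = \lim_{\varepsilon \to 0} \mathrm{geom}_\varepsilon(A) = \{\sigma \subset [n] \mid A_\sigma \text{ a nondeg. sq. dist. matrix}\}.$$

Note that if $\sigma = \{i\}$, we have $A_\sigma = [0]$. In this case,
$\{i\} \in \mathrm{geom}(A)$ and $\{i\} \in \mathrm{geom}_\varepsilon(A)$ for all $\varepsilon > 0$, by
our convention.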

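Lemma 1 implies that $\mathrm{geom}_\varepsilon(A)$ and geom(A) are
simplicial complexes, and $\mathrm{geom}_\varepsilon(A) = \mathrm{geom}(A)$ if and
only if $0 < \varepsilon < \delta(A)$, where

$$\delta(A) = \min_{\sigma \in \mathrm{geom}(A)} \left|\frac{\mathrm{cm}(A_\sigma)}{\det(A_\sigma)}\right|.$$

It also follows from Lemma 1 that if A is a nondegenerate
square distance matrix, then $\delta(A) = |\mathrm{cm}(A)/\det(A)|$.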

Applying Proposition 1 to each of the principal submatrices
of the perturbed matrix $-11^T + \varepsilon A$, we obtain:

Corollary 1 If A is a symmetric matrix with zero diagonal,
and $\varepsilon > 0$, then

$\mathrm{stab}(-11^T + \varepsilon A) = \mathrm{geom}_\varepsilon(A)$.

For $0 < \varepsilon < \delta(A)$, $\mathrm{stab}(-11^T + \varepsilon A) = \mathrm{geom}(A)$.

Next, recall that a clique in a graph G is a subset of 
vertices that is all-to-all connected. The clique complex 
of G, denoted X(G), is the set of all cliques in G; this 
is a simplicial complex for any G. 
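
For small graphs the clique complex can be enumerated directly from the definition. The following sketch uses only the standard library; the function name `clique_complex` and the four-vertex example graph are illustrative assumptions, and the brute-force enumeration is exponential in n.

```python
import itertools

def clique_complex(n, edges):
    """X(G): all nonempty vertex subsets of {0, ..., n-1} that are all-to-all connected."""
    E = {frozenset(e) for e in edges}
    cliques = set()
    for k in range(1, n + 1):
        for sigma in itertools.combinations(range(n), k):
            # A subset is a clique if every pair inside it is an edge
            # (vacuously true for single vertices).
            if all(frozenset(p) in E for p in itertools.combinations(sigma, 2)):
                cliques.add(frozenset(sigma))
    return cliques

# Hypothetical graph: a triangle on {0, 1, 2} plus a pendant edge (2, 3).
X = clique_complex(4, [(0, 1), (1, 2), (0, 2), (2, 3)])
```

The result contains the four vertices, the four edges, and the triangle {0, 1, 2}. Every subset of a clique is again a clique, which is why X(G) is a simplicial complex.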

Corollary 2 Let A be a symmetric $n \times n$ matrix with
zero diagonal, and $\varepsilon > 0$. Let G be the graph on n
vertices having $(ij) \in G$ if and only if $A_{ij} > 0$. For any
$n \times n$ matrix S with $S_{ij} = S_{ji} \geq 0$ and $S_{ii} = 0$, if S
"matches" A on G (i.e., if $S_{ij} = A_{ij}$ for all $(ij) \in G$),
then

$\mathrm{geom}_\varepsilon(A) = \mathrm{geom}_\varepsilon(S) \cap X(G)$.

In particular, $\mathrm{geom}(A) = \mathrm{geom}(S) \cap X(G)$.

Recall that any $n \times n$ synaptic strength matrix S
used in the Encoding Rule satisfies $S_{ij} = S_{ji} \geq 0$ and
$S_{ii} = 0$ for all $i, j \in [n]$. We are now ready to state our
main results.



Theorem 2 Let S and $\varepsilon > 0$ be fixed, as in Step 1
of the Encoding Rule, and let W be the final threshold-linear
network obtained from a prescribed set of patterns
$V \subset 2^{[n]}$ (equation (2)). Then,

$C(W) = \mathrm{geom}_\varepsilon(S) \cap X(G(V))$.

If $\varepsilon < \delta(S)$, then

$C(W) = \mathrm{geom}(S) \cap X(G(V))$.    (3)

Proof Any network W obtained via the Encoding Rule
(equation (2)) has the form $-D + W = -11^T + \varepsilon A$,
where A is symmetric with zero diagonal and "matches"
the (nonnegative) synaptic strength matrix S precisely
on the entries $A_{ij}$ such that $(ij) \in G(V)$. All other
off-diagonal entries of A are negative. It follows that

$C(W) = \mathrm{stab}(-11^T + \varepsilon A) = \mathrm{geom}_\varepsilon(A) = \mathrm{geom}_\varepsilon(S) \cap X(G(V))$,

where the last two equalities are due to Corollaries 1
and 2, which stem from Proposition 1. □

Next we identify a necessary and sufficient condition
for V to admit an exact solution to the NE problem via
a symmetric network.

Theorem 3 Let $V \subset 2^{[n]}$. There exists a symmetric
threshold-linear network W that is an exact solution to
the NE problem for V if and only if V is a simplicial
complex of the form

$V = \mathrm{geom}_\varepsilon(S) \cap X(G(V))$,    (4)

for some $\varepsilon > 0$ and S an $n \times n$ matrix with $S_{ij} = S_{ji} \geq 0$
and $S_{ii} = 0$ for all $i, j \in [n]$. Moreover, W can be
obtained using the Encoding Rule for V.

Proof ($\Leftarrow$) This is an immediate consequence of Theorem 2.
($\Rightarrow$) Suppose there exists a symmetric network
W with C(W) = V, and observe by Theorem 1 that
$C(W) = \mathrm{stab}(-11^T + A)$, for some symmetric $n \times n$
matrix A with zero diagonal. By Corollaries 1 and 2,

$V = C(W) = \mathrm{geom}_\varepsilon(A) = \mathrm{geom}_\varepsilon(S) \cap X(G)$,

where $\varepsilon = 1$, G is the graph associated to A (as in Corollary 2)
and S is an $n \times n$ matrix with $S_{ij} = S_{ji} \geq 0$
and zero diagonal that "matches" A on G. It remains
only to show that $\mathrm{geom}_\varepsilon(S) \cap X(G) = \mathrm{geom}_\varepsilon(S) \cap X(G(V))$.
Since $V = \mathrm{geom}_\varepsilon(A)$, any element $(ij) \in V$
must have corresponding $A_{ij} > 0$, so $G(V) \subset G$ and
hence $X(G(V)) \subset X(G)$. On the other hand, $V = V \cap X(G(V))$,
so we conclude that $V = \mathrm{geom}_\varepsilon(S) \cap X(G(V))$. □
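
The first statement of Theorem 2 can be verified by brute force on a toy example. The sketch below assumes D = I, takes $R_{ij} = 1$ for the "off" synapses, uses Corollary 1 to compute $\mathrm{geom}_\varepsilon(S)$ as $\mathrm{stab}(-11^T + \varepsilon S)$, and compares both sides of the identity. All names and the specific S, $\varepsilon$, and patterns are illustrative assumptions.

```python
import itertools
import numpy as np

def subsets(n):
    """All nonempty subsets of {0, ..., n-1}, as sorted tuples."""
    for k in range(1, n + 1):
        yield from itertools.combinations(range(n), k)

def stab(M):
    """Index sets whose (symmetric) principal submatrix has only negative eigenvalues."""
    return {frozenset(s) for s in subsets(M.shape[0])
            if np.all(np.linalg.eigvalsh(M[np.ix_(s, s)]) < 0)}

def clique_complex(n, edges):
    """X(G) for a graph given as a set of frozenset edges."""
    return {frozenset(s) for s in subsets(n)
            if all(frozenset(p) in edges for p in itertools.combinations(s, 2))}

n, eps = 3, 0.5
# S: squared distances between the hypothetical points (0,0), (1,0), (0,1).
S = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 2.0],
              [1.0, 2.0, 0.0]])
patterns = [{0, 1}, {1, 2}]
edges = {frozenset(p) for s in patterns
         for p in itertools.combinations(sorted(s), 2)}

# -D + W from equation (2): diagonal -1, "off" entries -1 - eps, "on" entries -1 + eps*S_ij.
M = -np.ones((n, n)) - eps * (1.0 - np.eye(n))
for e in edges:
    i, j = tuple(e)
    M[i, j] = M[j, i] = -1.0 + eps * S[i, j]

code_W = stab(M)                                   # C(W), by Theorem 1
geom_eps_S = stab(-np.ones((n, n)) + eps * S)      # geom_eps(S), by Corollary 1
rhs = geom_eps_S & clique_complex(n, edges)
```

In this example every nonempty subset belongs to $\mathrm{geom}_\varepsilon(S)$, so the code of the network is exactly the clique complex of the co-firing graph: the three single neurons and the two encoded pairs.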

Remarks. As an immediate consequence of Theorem 2,
we know that all clique complexes can be exactly encoded
in threshold-linear networks.³ If V is a clique
complex on n vertices (neurons), then V = X(G(V)).
Fix S to be any $n \times n$ nondegenerate square distance
matrix, and let $0 < \varepsilon < \delta(S) = |\mathrm{cm}(S)/\det(S)|$. Then
$\mathrm{geom}_\varepsilon(S) = \mathrm{geom}(S) = 2^{[n]}$, and hence by Theorem 2
the network W obtained from V via the Encoding Rule
is an exact solution, as its code is given by
C(W) = X(G(V)) = V.

³ For recent work encoding cliques in Hopfield networks,
see [H].

If V is the k-skeleton⁴ of a clique complex on n vertices,
with $k < n - 1$, then

$V = X_k(G(V)) = \{\sigma \in X(G(V)) \mid |\sigma| \leq k + 1\}$.

Any such V can also be exactly encoded. Fix S to be a
(degenerate) $n \times n$ square distance matrix for a configuration
of n points that are in general position in $\mathbb{R}^k$,
and let $0 < \varepsilon < \delta(S)$. Then $\mathrm{geom}_\varepsilon(S) = \mathrm{geom}(S) = \{\sigma \subset [n] \mid |\sigma| \leq k + 1\}$
is the k-skeleton of $2^{[n]}$. Since
$\mathrm{geom}(S) \cap X(G(V)) = X_k(G(V))$, Theorem 2 implies
that the network W obtained from V via the Encoding
Rule is an exact solution to the NE problem for V.

It is worth noting here that solutions obtained using
a degenerate square distance matrix S are not as fine-tuned
as they might first appear. This is because the
ratio $|\mathrm{cm}(S_\sigma)/\det(S_\sigma)|$ approaches zero as subsets of
points $\{p_i\}_{i \in \sigma}$ become approximately degenerate, allowing
elements to be eliminated from $\mathrm{geom}_\varepsilon(S)$ because
of violations to condition (b) in Proposition 1, even if
condition (a) is not quite violated (see Remark 2 in the
Appendix).

Spurious states, Helly's theorem, and Place Field 
codes 

Recall that our Encoding Rule assumes the synaptic
strength matrix S is an intrinsic property of the underlying
network. Theorem 2 implies that certain "universal"
choices of S enable any $V \subset 2^{[n]}$ to be encoded,
yielding $C(W) = X(G(V)) \supseteq V$. The price to pay, however,
is the emergence of spurious states.

Spurious states. Recall that spurious states are elements
of C(W) that are not in the prescribed list V. We
can divide them into two types: the first type consists of
encoded patterns $\sigma \in C(W) \setminus V$ that are subsets of patterns
in V, while the second type consists of all other elements
of $C(W) \setminus V$. The first type of spurious states are
guaranteed to be present for any symmetric encoding
rule, unless V is a simplicial complex. This is because
$\mathrm{stab}(-D + W)$ is a simplicial complex for symmetric
W. It is not clear, however, that these states should
be considered truly "spurious," since they correspond
to partial patterns whose retrieval does not necessarily
constitute an "error" on the part of the network. For
this reason, we restrict attention to the second type, as
was previously done in [33]. The second type of spurious
states contains all $\sigma \in C(W)$ such that $\sigma$ is not a subset
of any $\tau \in V$. Because each such $\sigma$ resulting from our
Encoding Rule is an element of X(G(V)), we will refer
to these states from now on as spurious cliques.

⁴ The k-skeleton of a simplicial complex is obtained by restricting
to faces of dimension $\leq k$, which corresponds to
elements $\sigma \subset [n]$ of size $|\sigma| \leq k + 1$.

Perhaps surprisingly, some common neural codes 
have the property that the full set of patterns to be 
encoded naturally contains most of the cliques in the 
code's co-firing graph, so that $V \approx X(G(V))$. Such 
codes have very few spurious cliques. This is precisely 
the case for PF codes. 

PF codes. Let $\{U_1, \ldots, U_n\}$ be a collection of convex
open sets in $\mathbb{R}^d$, where each $U_i$ is the place field corresponding
to the ith neuron. To such a set of place
fields we associate a d-dimensional PF code, V, defined
as follows: for each $\sigma \in 2^{[n]}$, $\sigma \in V$ if and only if the
intersection $\bigcap_{i \in \sigma} U_i$ is nonempty. PF codes are combinatorial
neural codes; note that this definition yields a
simplicial complex, called the nerve of the cover [S].

PF codes are experimentally observed in recordings
of neural activity in rodent hippocampus [23]. The
elements of V correspond to subsets of neurons that
may be co-activated as the animal's trajectory passes
through a corresponding set of overlapping place fields.
Typically d = 1 or d = 2, corresponding to the standard
"linear track" and "open field" environments [23];
it has also been hypothesized that some animals possess
d = 3 place fields [32].

Since V is a simplicial complex, encoding a PF code 
using the Encoding Rule produces no spurious states of 
the first type. What about spurious cliques? Remark- 
ably, there are very few of them, since most cliques in 
X(G(V)) are already contained in V . This follows from 
the classical Helly's theorem [5]. 

Helly's theorem. Suppose that $U_1, \ldots, U_k$ is a finite
collection of convex subsets of $\mathbb{R}^d$, for $d < k$. If the
intersection of any d + 1 of these sets is nonempty, then
the full intersection $\bigcap_{i=1}^{k} U_i$ is also nonempty. To see
the implications of Helly's theorem for PF codes, we
first define the notion of Helly completion:

Definition 2 Let $\Delta$ be a simplicial complex on n vertices,
and let $\Delta_d = \{\sigma \in \Delta \mid |\sigma| \leq d + 1\}$ denote
its d-skeleton. The Helly completion is the largest simplicial
complex, $\bar{\Delta}_d$, on n vertices that has $\Delta_d$ as its
d-skeleton.

For example, the d = 1 Helly completion of a graph G
is the clique complex X(G). Helly's theorem can now
be reformulated as:

Lemma 2 Let V be a d-dimensional PF code, corresponding
to a set of place fields $\{U_1, \ldots, U_n\}$ where each
$U_i$ is a convex open set in $\mathbb{R}^d$. Then V is the Helly
completion of its own d-skeleton: $V = \bar{V}_d$.

In particular, any one-dimensional PF code is always
a clique complex, and thus has an exact solution
to the NE problem that can be obtained using the Encoding
Rule. A two-dimensional PF code V is the Helly
completion of its own 2-skeleton, which can be obtained
from knowledge of all pairwise and triple intersections
of place fields. The only possible spurious cliques are
therefore spurious triples and the larger cliques of G(V)
that contain them. These spurious triples emerge when
three place fields $U_i$, $U_j$ and $U_k$ have the property that
each pair intersect, but $U_i \cap U_j \cap U_k = \emptyset$.
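
The one-dimensional case of Lemma 2 is easy to check computationally, since open intervals have a common point exactly when the largest left endpoint is smaller than the smallest right endpoint. The sketch below builds the PF code of a few hypothetical intervals and compares it with the clique complex of its co-firing graph; the function names and interval endpoints are illustrative assumptions.

```python
import itertools

def pf_code_1d(intervals):
    """PF code of open intervals (a_i, b_i): subsets with nonempty common intersection."""
    n = len(intervals)
    code = set()
    for k in range(1, n + 1):
        for sigma in itertools.combinations(range(n), k):
            left = max(intervals[i][0] for i in sigma)
            right = min(intervals[i][1] for i in sigma)
            if left < right:           # the open intervals share a point
                code.add(frozenset(sigma))
    return code

def clique_complex(n, edges):
    """All nonempty vertex subsets whose pairs are all edges."""
    return {frozenset(s)
            for k in range(1, n + 1)
            for s in itertools.combinations(range(n), k)
            if all(frozenset(p) in edges for p in itertools.combinations(s, 2))}

# Hypothetical place fields on a linear track.
fields = [(0.0, 2.0), (1.0, 3.0), (2.5, 4.0), (1.5, 2.8)]
V = pf_code_1d(fields)
G = {sigma for sigma in V if len(sigma) == 2}   # co-firing graph of V
X = clique_complex(len(fields), G)
```

For these intervals the two sets coincide, so encoding V with the Encoding Rule and a suitable S produces no spurious cliques. In two dimensions the analogous check would have to test triple intersections as well, which is where spurious triples can appear.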

Encoding sparse PF codes in threshold-linear 
networks 

Helly's theorem sharply limits the number of spuri- 
ous cliques that result from encoding two-dimensional 
PF codes. For "sparse" PF codes, we find that spuri- 
ous cliques can be further restricted by an appropriate 
choice of S. We also find that PF codes can be en- 
coded from a very small, random sample of patterns. 
The near-exact encoding of PF codes from highly un- 
dersampled data makes them quite natural codes in the 
context of threshold-linear networks. 

Controlling spurious cliques in sparse codes. Experimentally
observed neural activity in cortical and
hippocampal areas suggests that neural codes are sparse
[21, 4], meaning that few neurons are co-active in response
to stimuli. If the set of patterns $V \subset 2^{[n]}$ to be
encoded is a k-sparse code, i.e. if $|\sigma| \leq k < n$ for all
$\sigma \in V$, then any clique of size k + 1 or greater in G(V)
is potentially spurious. We can eliminate these spurious
states, however, by choosing S in the Encoding Rule to
be a degenerate square distance matrix for a configuration
of points $p_1, \ldots, p_n \in \mathbb{R}^{k-1}$ and $0 < \varepsilon < \delta(S)$. This
guarantees that $\mathrm{geom}_\varepsilon(S)$ does not include any element
of size greater than k, and hence $C(W) \subset X_{k-1}(G(V))$.
Note that such a choice of S is "universal," as it works
for any code V of sparsity k.

Near-exact encoding of sparse PF codes. Consider
a two-dimensional PF code V that is k-sparse, so
that no more than k neurons can co-fire in a single pattern,
even if there are higher-order overlaps of place
fields. Experimental evidence suggests that the fraction
of active neurons is typically on the order of 5-10%
[3], so we make the conservative choice of k = 0.1n (our
results improve with smaller k). In what follows, S and
$\varepsilon$ are chosen as above to control spurious cliques of size
greater than k, and we assume the worst-case scenario
of $C(W) = X_{k-1}(G(V))$, providing an upper bound on
the number of spurious cliques resulting from our Encoding
Rule. What fraction of the encoded patterns are
spurious? This can be quantified by the following error
probability:

$$P_{\mathrm{error}} = \frac{|C(W) \setminus V|}{|C(W)|} = \frac{|X_{k-1}(G(V))| - |V|}{|X_{k-1}(G(V))|}.$$

For exact encoding, $P_{\mathrm{error}} = 0$, while large numbers of
spurious states will push $P_{\mathrm{error}}$ close to 1.


[Figure 1: panels A and B]

Fig. 1 PF encoding is near-exact, and can be achieved
by presenting a small fraction of patterns. (A) $P_{\mathrm{error}}$ was
computed for randomly generated k-sparse PF codes having
n = 80, 90 and 100 neurons and k = 0.1n. For each jitter ratio,
the average value of $P_{\mathrm{error}}$ over 100 codes is shown. (B)
For n = 90, 100 and 110 neurons, k-sparse PF codes with
jitter ratio 0.1 were randomly generated and then randomly
subsampled to contain a small fraction (< 5%) of the total
number of patterns. After applying the Encoding Rule to the
subsampled code, the number of encoded cliques was computed.
In each case, the fraction of encoded cliques for the
subsampled code (as compared to the full PF code) was averaged
over 10 codes. Cliques were counted using Cliquer [25],
together with custom-made Matlab software.

To investigate how "exactly" two-dimensional PF codes are encoded, we generated random
k-sparse PF codes with circular place fields, n = 80-100 neurons, and k = 0.1n (see the
Appendix). Because experimentally observed place fields do not have precise boundaries,
we also generated "jittered" codes, where spurious triples were eliminated from the
2-skeleton of the code if they did not survive after enlarging the place field radii
from $r_0$ to $r_1$ by a jitter ratio, $(r_1 - r_0)/r_0$. This has the effect of
eliminating spurious cliques that are unlikely to be observed in neural activity, as
they correspond to very small regions in the underlying environment. For each code and
each jitter ratio (up to approximately 0.1), we computed $P_{\mathrm{error}}$ using the
formula above. Even without jitter, the error probability was small, and
$P_{\mathrm{error}}$ decreased quickly to values near zero for 10% jitter (Fig. 1A).

Encoding full PF codes from highly undersampled sets of patterns. To investigate what
fraction of patterns is needed to encode a two-dimensional PF code using the Encoding
Rule, we generated randomly subsampled codes from k-sparse PF codes. We then computed
the number of patterns that would be encoded by a network if a subsampled code was
presented. Perhaps surprisingly, network codes obtained from highly subsampled PF codes
(having only 1-5% of the patterns) are nearly identical to those obtained from full PF
codes (Fig. 1B). This is because large numbers of "spurious" states emerge when encoding
subsampled codes, but most correspond to patterns in the full code. The spurious states
of subsampled PF codes can therefore be advantageous, allowing networks to quickly
encode full PF codes from only a small fraction of the patterns.

Exact solutions to the NE problem 

We have seen that clique complexes and k-skeleta of clique complexes can all be encoded
exactly, and two-dimensional PF codes can be encoded nearly exactly, without tuning the
synaptic strength matrix S as a function of the patterns to be encoded. If, instead, we
are allowed to tune S as a function of V, it is clear from Theorem 3 that we can obtain
exact solutions to the NE problem for a wider class of simplicial complexes. In
particular, if $V = \mathrm{geom}(S)$ for some n × n matrix S satisfying
$S_{ij} = S_{ji} \ge 0$ and $S_{ii} = 0$, then C(W) = V after applying the Encoding Rule
with this S and $0 < \varepsilon < \delta(S)$. It follows that any V of the form
$\mathrm{geom}(S)$ admits an exact solution to the NE problem.

In the special case where S is a square distance matrix, $\mathrm{geom}(S)$ is a
representable matroid complex, i.e., it is the independent set complex of a
real-representable matroid [25]. Moreover, it is easy to see that all representable
matroid complexes are of this form, and can thus be encoded exactly. In addition,
Theorem 3 implies that we can exactly encode any $V \subset 2^{[n]}$ of the form
$V = A \cap X(G)$, where A is a representable matroid complex and X(G) is the clique
complex of a graph. The following example is of this type.

Example. Suppose V is the two-dimensional simplicial complex on n = 6 neurons depicted
in Figure 2A. V is clearly not a clique complex or the k-skeleton of a clique complex,
nor is V a representable matroid complex, as it violates the independent set exchange
property [55]. Nevertheless, there are exact solutions to the NE problem for V. One
exact solution can be obtained by choosing S to be the square distance matrix
corresponding to the configuration of points in Figure 2B. Another exact solution arises
by constructing an S that is not a square distance matrix, but has select principal
submatrices that are (see the Appendix).




[Figure 2. Panel A: maximal patterns {124}, {135}, {236}, {456}. Panel B: configuration
of points for S.]

Fig. 2 An example on n = 6 neurons. (A) A simplicial complex V consisting of four
two-dimensional facets (shaded triangles). The graph G(V) contains the 12 depicted
edges. (B) A configuration of points $p_1, \dots, p_6 \in \mathbb{R}^2$ that can be used
to exactly encode V. Lines indicate triples of points that are collinear. From this
configuration we construct a 6 × 6 synaptic strength matrix S, with
$S_{ij} = \|p_i - p_j\|^2$, and choose $0 < \varepsilon < \delta(S)$. The geometry of
the configuration implies that $\mathrm{geom}(S)$ does not contain any patterns of size
greater than 3, nor does it contain the triples {123}, {145}, {246}, or {356}. It is
straightforward to check that $V = \mathrm{geom}(S) \cap X(G(V))$.

Open questions. We conclude this section with some mathematical questions. Can a
combinatorial description be found for all simplicial complexes that are of the form
$\mathrm{geom}_\varepsilon(S)$ or $\mathrm{geom}(S)$, where S and $\varepsilon$ satisfy
the conditions in Theorem 3? For such complexes, can the appropriate S and $\varepsilon$
be obtained constructively? Does every simplicial complex V admit an exact solution to
the NE problem via a symmetric network W? I.e., is every simplicial complex of the form
$\mathrm{geom}_\varepsilon(S) \cap X(G(V))$, as in equation Q? If not, what are the
obstructions? More generally, does every simplicial complex admit an exact solution
(not necessarily symmetric) to the NE problem?

Discussion 

Understanding the relationship between the connectiv- 
ity matrix and the activity patterns of a neural network 
is one of the central challenges in theoretical neuro- 
science. We have found that in the context of threshold- 
linear networks, one can obtain an unexpectedly precise 
understanding of the binary activity patterns encoded 
by network steady states. In particular, we have shown 
that these networks naturally encode neural codes aris- 
ing from low-dimensional receptive fields (such as place 
fields) while introducing very few spurious states. Re- 
markably, these codes can be "learned" by the network 
from a highly undersampled set of patterns. 

Neural codes representing (continuous) parametric stimuli, such as place field codes,
have typically been modeled as arising from continuous attractor networks whose synaptic
matrices have symmetric "Mexican hat"-type connectivity [6, 23]. This is in large part
due to the fact that there is a well-developed mathematical handle on these networks
[2, 10, 22]. Our work shows that one can have fine mathematical control over a much
wider class of networks, encompassing all symmetric connectivity matrices. It may thus
provide a novel foundation for understanding and "engineering" neural networks with
prescribed steady state properties.

Appendix: Proofs and Supporting Text

Not all codes are realizable by threshold-linear 
networks 

The following result applies to any square matrix A, not necessarily symmetric.

Lemma 3 Let A be an n × n matrix with strictly negative diagonal and $n \ge 2$. If A is
stable, then there exists a 2 × 2 principal submatrix of A that is also stable.

Proof We use the formula for the characteristic polynomial in terms of sums of principal
minors to obtain:

$$p_A(X) = (-1)^n X^n + (-1)^{n-1} m_1(A) X^{n-1} + (-1)^{n-2} m_2(A) X^{n-2} + \dots + m_n(A),$$

where $m_k(A)$ is the sum of the k × k principal minors of A. Writing the characteristic
polynomial in terms of symmetric polynomials in the eigenvalues
$\alpha_1, \alpha_2, \dots, \alpha_n$, and assuming A is stable, we have
$m_2(A) = \sum_{i<j} \alpha_i \alpha_j > 0$. This implies that at least one 2 × 2
principal minor is positive. Since the corresponding 2 × 2 principal submatrix has
negative trace, it must be stable. □

Combining Lemma 3 with Theorem 1 we obtain:

Corollary 3 Let $V \subset 2^{[n]}$. If there exists a pattern $\sigma \in V$ such that
no order 2 subset of $\sigma$ belongs to V, then V is not realizable as C(W) for any
threshold-linear network W.



Stable symmetric matrices 

Here we summarize some well-known facts about the stability of symmetric matrices that
we use in various proofs. The first is Cauchy's interlacing theorem, which relates
eigenvalues of a symmetric matrix to those of its principal submatrices.

Theorem 4 (Cauchy's interlacing theorem [20]) Let A be a symmetric n × n matrix, and let
B be an m × m principal submatrix of A. If the eigenvalues of A are
$\alpha_1 \le \dots \le \alpha_j \le \dots \le \alpha_n$ and those of B are
$\beta_1 \le \dots \le \beta_j \le \dots \le \beta_m$, then
$\alpha_j \le \beta_j \le \alpha_{n-m+j}$ for all j.

An immediate consequence of this theorem is:

Corollary 4 Any principal submatrix of a stable symmetric matrix is stable. Any
symmetric matrix containing an unstable principal submatrix is unstable.

Another well-known consequence of Cauchy's interlacing theorem is the following Lemma.
Here $A_{[k]}$ refers to the principal submatrix obtained by taking the upper left
k × k entries of A.

Lemma 4 Let A be a real symmetric n × n matrix. Then the following are equivalent:

1. A is a stable matrix.

2. $(-1)^k \det(A_{[k]}) > 0$ for all $1 \le k \le n$.

3. $(-1)^{|\sigma|} \det(A_\sigma) > 0$ for every $\sigma \subseteq [n]$.

Codes of symmetric networks 

Recall from part 1 of Theorem 1 that all codes C(W), where W is a threshold-linear
network with dynamics given by equation 0, have the form

$$C(W) = \mathrm{stab}(-D + W).$$

Here we show that when W is symmetric (like the networks obtained using the Encoding
Rule (2)), C(W) can always be expressed as $\mathrm{stab}(-11^T + A)$ or
$\mathrm{stab}(-xy^T + B)$, where $-xy^T$ is any rank 1 matrix having strictly negative
diagonal, and A, B are square matrices with zero diagonal. In particular, Lemma 6 gives
part 2 of Theorem 1.

In what follows, we use the notation

$$\mathbb{R}^n_\times = \{ v \in \mathbb{R}^n \mid v_i \neq 0 \text{ for all } i \in [n] \},$$

for the set of vectors with all nonzero entries. Given a vector
$v \in \mathbb{R}^n_\times$ and an n × n matrix A,

$$A^v = \mathrm{diag}(v)\, A\, \mathrm{diag}(v)$$

denotes the matrix with entries $A^v_{ij} = v_i v_j A_{ij}$. Note that for principal
submatrices, $(A^v)_\sigma = (A_\sigma)^{v_\sigma}$, so we simply denote this matrix
$A^v_\sigma$. This notation will also be used later, in Theorem 5.

Lemma 5 Let M be a symmetric n × n matrix, and $v \in \mathbb{R}^n_\times$. Then,

$$\mathrm{stab}(M^v) = \mathrm{stab}(M).$$

Proof By Lemma 4, $\tau \in \mathrm{stab}(M)$ if and only if
$(-1)^{|\sigma|} \det(M_\sigma) > 0$ for every $\sigma \subseteq \tau$. Observe that,
since $M^v = \mathrm{diag}(v)\, M\, \mathrm{diag}(v)$, we have
$\mathrm{sgn}(\det(M^v_\sigma)) = \mathrm{sgn}(\det(M_\sigma))$ for all
$\sigma \subseteq [n]$. It follows that $\tau \in \mathrm{stab}(M^v)$ if and only if
$\tau \in \mathrm{stab}(M)$. □

Lemma 6 For any symmetric threshold-linear network W on n neurons, there exists a
symmetric n × n matrix A with zero diagonal such that

$$C(W) = \mathrm{stab}(-11^T + A).$$

Proof Let $x \in \mathbb{R}^n_\times$ be the vector such that
$\mathrm{diag}(-xx^T) = \mathrm{diag}(-D + W)$, and write

$$-D + W = -xx^T + (-D + W + xx^T),$$

where the term in parentheses is symmetric and has zero diagonal. This can be rewritten
as

$$-D + W = \mathrm{diag}(x)(-11^T + A)\,\mathrm{diag}(x) = (-11^T + A)^x,$$

where

$$A = \mathrm{diag}(x)^{-1}(-D + W + xx^T)\,\mathrm{diag}(x)^{-1}$$

is a symmetric n × n matrix with zero diagonal. It follows from Lemma 5 that
$C(W) = \mathrm{stab}(-D + W) = \mathrm{stab}(-11^T + A)$. □

Lemma 6 implies that all codes C(W) for symmetric networks W have the form
$C(W) = \mathrm{stab}(-11^T + A)$, where A is a symmetric matrix having zero diagonal.
The following Proposition implies that all such codes can also be obtained by perturbing
around any rank 1 matrix with negative diagonal, not necessarily symmetric. Note that if
$x, y \in \mathbb{R}^n_\times$, the rank 1 matrix $-xy^T$ has strictly negative diagonal
if and only if $x_i y_i > 0$ for all $i \in [n]$.

Proposition 2 Fix $x, y \in \mathbb{R}^n_\times$ with $x_i y_i > 0$ for all
$i \in [n]$. For any symmetric threshold-linear network W on n neurons, there exists an
n × n matrix B with zero diagonal such that

$$C(W) = \mathrm{stab}(-xy^T + B).$$

The proof of this Proposition constructs the matrix B explicitly, and relies on the
following Lemma.

Lemma 7 Let M be any n × n matrix, and T an n × n invertible diagonal matrix. Then

$$\mathrm{stab}(TMT^{-1}) = \mathrm{stab}(M).$$

Proof We have $(TMT^{-1})_\sigma = T_\sigma M_\sigma T_\sigma^{-1}$. Since conjugation
preserves the eigenvalue spectrum, the statement follows. □

Proof (Proof of Proposition 2) Let W be a symmetric threshold-linear network on n
neurons. By Lemma 6 there exists a symmetric n × n matrix A with zero diagonal such that
$C(W) = \mathrm{stab}(-11^T + A)$. It thus remains only to construct an n × n matrix B
with zero diagonal such that

$$\mathrm{stab}(-xy^T + B) = \mathrm{stab}(-11^T + A).$$

We prove that this can always be done in two steps: first, we prove that it can be done
in the special case x = y, and then we show that B can be constructed in general.

Step 1: Fix $x = y \in \mathbb{R}^n_\times$, and observe that
$-xx^T + A^x = (-11^T + A)^x$, so by Lemma 5 we have
$\mathrm{stab}(-xx^T + A^x) = \mathrm{stab}(-11^T + A)$. Letting $B = A^x$ we obtain the
desired statement.

Step 2: Fix $x, y \in \mathbb{R}^n_\times$ so that $x_i y_i > 0$ for all $i \in [n]$,
and let T be the diagonal matrix with entries $T_{ii} = \sqrt{y_i / x_i}$. Then

$$(T(-xy^T)T^{-1})_{ij} = \sqrt{\frac{y_i}{x_i}}\,(-x_i y_j)\,\sqrt{\frac{x_j}{y_j}} = -z_i z_j,$$

so $T(-xy^T)T^{-1} = -zz^T$ for $z \in \mathbb{R}^n_\times$ having entries
$z_i = \mathrm{sgn}(x_i)\sqrt{x_i y_i}$ (which is simply $\sqrt{x_i y_i}$ when
$x_i > 0$). It follows from Step 1 that
$\mathrm{stab}(-11^T + A) = \mathrm{stab}(-zz^T + A^z) = \mathrm{stab}(T(-xy^T)T^{-1} + A^z)$.
Let

$$B = T^{-1} A^z T.$$

Then, using Lemma 7,
$\mathrm{stab}(-xy^T + B) = \mathrm{stab}(T(-xy^T + B)T^{-1}) = \mathrm{stab}(-11^T + A)$.
Since A has zero diagonal, so do $A^z$ and B. Note that B can be obtained explicitly,
using the expression for A in the proof of Lemma 6. □

Statement of Theorem 5 and Proof of Proposition 1

Theorem 5 is our core technical result. It is closely related to some relatively recent
results in convex geometry, involving correlation matrices and the geometry of the
"elliptope" [2]. Our proof, however, relies only on classical distance geometry and
well-known facts about stable symmetric matrices. Following the statement we prove
Proposition 1 from the Main Text, which is essentially a special case.

Note that for $v \in \mathbb{R}^n_\times$, $-vv^T$ is a symmetric rank 1 matrix with
strictly negative diagonal. We will also need the following definition.

Definition 3 A Hebbian matrix A is an n × n matrix satisfying $A_{ij} = A_{ji} \ge 0$
and $A_{ii} = 0$ for all $i, j \in [n]$.

The name reflects the fact that these are precisely the types of matrices that arise
when synaptic weights are modified by a Hebbian learning rule.



Theorem 5 Fix $v \in \mathbb{R}^n_\times$, and consider the perturbed matrix,

$$M = -vv^T + \varepsilon A^v,$$

where A is a Hebbian matrix and $\varepsilon > 0$. Then the following are equivalent:

1. A is a nondegenerate square distance matrix.

2. There exists an $\varepsilon > 0$ such that M is stable.

3. There exists a $\delta > 0$ such that M is stable for all $0 < \varepsilon < \delta$.

4. $0 < -\dfrac{\mathrm{cm}(A)}{\det A} < \infty$; and, M is stable if and only if
$0 < \varepsilon < -\dfrac{\mathrm{cm}(A)}{\det A}$.

Proof (Proof of Proposition 1) Setting $v = 1 \in \mathbb{R}^n_\times$ (the column
vector of all ones) in Theorem 5 yields a slightly weaker version of Proposition 1 from
the Main Text, with the hypothesis that A is Hebbian, rather than merely symmetric with
zero diagonal.

To see why Proposition 1 holds more generally, suppose A is symmetric with zero diagonal
but not Hebbian. Then there exists an off-diagonal pair of negative entries,
$A_{ij} = A_{ji} < 0$, and the 2 × 2 principal submatrix

$$(-11^T + \varepsilon A)_{\{ij\}} = \begin{pmatrix} -1 & -1 + \varepsilon A_{ij} \\ -1 + \varepsilon A_{ij} & -1 \end{pmatrix}$$

is unstable as it has negative trace and negative determinant. It follows from Cauchy's
interlacing theorem (Corollary 4) that $-11^T + \varepsilon A$ is unstable for any
$\varepsilon > 0$. Correspondingly, condition (a) in Proposition 1 is violated, as the
existence of negative entries guarantees that A cannot be a nondegenerate square
distance matrix. □



Ingredients for the Proof of Theorem 5

Here we present some ingredients necessary for the proof of Theorem 5. First, we review
some classical results about square distance matrices. Next, we present a "determinant
lemma" that is critical for our proof.

Square distance matrices. Recall from the Main Text the definitions of square distance
matrix, nondegenerate square distance matrix, and Cayley-Menger determinant. Our
convention is that the 1 × 1 zero matrix [0] is a nondegenerate square distance matrix,
as $|\mathrm{cm}([0])| = 1 > 0$. As an example, a 3 × 3 symmetric matrix A with zero
diagonal is a nondegenerate square distance matrix if and only if the off-diagonal
entries $A_{ij}$ are all positive, and their square roots ($\sqrt{A_{12}}$,
$\sqrt{A_{13}}$, and $\sqrt{A_{23}}$) satisfy all three triangle inequalities strictly.
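
The 3 × 3 example can be checked mechanically. The sketch below is an illustration only: it tests the sign condition $(-1)^{|\sigma|}\mathrm{cm}(A_\sigma) > 0$ on every principal submatrix (Menger's criterion, stated as Lemma 9 in this Appendix), using exact rational arithmetic. The helper names `det`, `cm`, and `is_nondegenerate_sdm` are hypothetical, and the input is assumed to be symmetric with zero diagonal.

```python
from fractions import Fraction
from itertools import combinations


def det(m):
    """Exact determinant via Gaussian elimination over Fractions."""
    a = [[Fraction(x) for x in row] for row in m]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return sign * result


def cm(a):
    """Cayley-Menger determinant: det of A bordered by a row/column of ones."""
    n = len(a)
    return det([[0] + [1] * n] + [[1] + list(row) for row in a])


def is_nondegenerate_sdm(a):
    """Check (-1)^{|s|} cm(A_s) > 0 for every nonempty index set s."""
    n = len(a)
    for size in range(1, n + 1):
        for s in combinations(range(n), size):
            sub = [[a[i][j] for j in s] for i in s]
            if (-1) ** size * cm(sub) <= 0:
                return False
    return True


def from_lengths(d12, d13, d23):
    """3 x 3 matrix of squared side lengths."""
    return [[0, d12**2, d13**2], [d12**2, 0, d23**2], [d13**2, d23**2, 0]]


equilateral = is_nondegenerate_sdm(from_lengths(1, 1, 1))   # strict triangle
collinear = is_nondegenerate_sdm(from_lengths(1, 1, 2))     # degenerate case
violated = is_nondegenerate_sdm(from_lengths(1, 1, 3))      # 3 > 1 + 1
```

Enumerating all index sets is exponential in n, so this is only practical for small matrices; Remark 4 in this Appendix discusses a cheaper test.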

There are two classical characterizations of square distance matrices. The first, due to
Menger [8], relies on Cayley-Menger determinants. The second, due to Schoenberg [30],
uses eigenvalues of principal submatrices. Both are needed for our proof of Theorem 5.

The relationship between Cayley-Menger determinants and simplex volumes is well-known:

Lemma 8 Let $p_1, \dots, p_k$ be k points in a Euclidean space. Assume that
$A_{ij} = \|p_i - p_j\|^2$ is the matrix of square distances between these points. Then
the (k − 1)-dimensional volume V of the convex hull of the points $\{p_i\}_{i=1}^k$ can
be computed as

$$V^2 = \frac{(-1)^k}{2^{k-1}\,((k-1)!)^2}\,\mathrm{cm}(A). \qquad (5)$$

In particular, if A is a degenerate square distance matrix then $\mathrm{cm}(A) = 0$.

This leads to Menger's characterization of square distance matrices. Recall that
$A_\sigma$ is the principal submatrix obtained by restricting A to the index set
$\sigma$.

Lemma 9 Let A be an n × n matrix satisfying $A_{ij} = A_{ji} \ge 0$ and $A_{ii} = 0$ for
all $i, j \in [n]$ (i.e., A is a Hebbian matrix). Then,

1. A is a square distance matrix if and only if
$(-1)^{|\sigma|}\,\mathrm{cm}(A_\sigma) \ge 0$ for every $A_\sigma$.

2. A is a nondegenerate square distance matrix if and only if
$(-1)^{|\sigma|}\,\mathrm{cm}(A_\sigma) > 0$ for every $A_\sigma$.

Proof (1) is equivalent to the Corollary of Theorem 42.2 in [8]. (2) is equivalent to
Theorem 41.1 in [8]. □

Schoenberg's characterization implies that if a matrix is a square distance matrix, then
the determinant of any principal submatrix has opposite sign to that of its
Cayley-Menger determinant.

Proposition 3 Let A be an n × n square distance matrix that is not the zero matrix.
Then:

1. A has one strictly positive eigenvalue and n − 1 eigenvalues that are less than or
equal to zero. In particular, $(-1)^{|\sigma|} \det(A_\sigma) \le 0$ for every principal
submatrix $A_\sigma$.

2. If A is a nondegenerate square distance matrix, A has no zero eigenvalues and
$(-1)^{|\sigma|} \det(A_\sigma) < 0$ for every principal submatrix $A_\sigma$ with
$|\sigma| > 1$.

Proof This Proposition is contained in [13, Theorem 6.2.16]. It can also be proven
directly from Theorem 1 of Schoenberg's 1935 paper [30]. □

Corollary 5 If A is an n × n nondegenerate square distance matrix with n > 1, then

$$-\frac{\mathrm{cm}(A)}{\det A} > 0.$$


The determinant lemma. A cornerstone of the proof of Theorem 5 is the following lemma,
which allows us to connect perturbations of symmetric rank 1 matrices to Cayley-Menger
determinants. The statement is a bit more general, however, as $-uv^T$ can be any square
matrix of rank 1.

Lemma 10 Let $u, v \in \mathbb{R}^n$. Then for any real-valued n × n matrix A and any
$t \in \mathbb{R}$,

$$\det(-uv^T + t\,\mathrm{diag}(u)\,A\,\mathrm{diag}(v)) = \det(\mathrm{diag}(u))\,\det(\mathrm{diag}(v))\,\bigl(t^n \det A + t^{n-1}\,\mathrm{cm}(A)\bigr).$$

In particular, if $u = v \in \mathbb{R}^n_\times$ and t > 0, then

$$\mathrm{sgn}(\det(-vv^T + tA^v)) = \mathrm{sgn}\bigl(t \det A + \mathrm{cm}(A)\bigr),$$

where $\mathrm{sgn} : \mathbb{R} \to \{\pm 1, 0\}$ detects the sign of the argument.

Corollary 6 $\det(-11^T + tA) = t^n \det A + t^{n-1}\,\mathrm{cm}(A)$.
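
The identity in Corollary 6 is easy to spot-check with exact arithmetic. The following sketch is illustrative only (the helpers `det`, `cm`, and `corollary6_holds` and the sample matrices are assumptions, not taken from the paper); it compares both sides for a symmetric and a non-symmetric matrix at several values of t.

```python
from fractions import Fraction


def det(m):
    """Exact determinant via Gaussian elimination over Fractions."""
    a = [[Fraction(x) for x in row] for row in m]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return sign * result


def cm(a):
    """Cayley-Menger determinant: det of A bordered by a row/column of ones."""
    n = len(a)
    return det([[0] + [1] * n] + [[1] + list(row) for row in a])


def corollary6_holds(a, t):
    """Compare det(-11^T + tA) with t^n det A + t^(n-1) cm(A)."""
    t = Fraction(t)
    n = len(a)
    lhs = det([[-1 + t * a[i][j] for j in range(n)] for i in range(n)])
    rhs = t**n * det(a) + t ** (n - 1) * cm(a)
    return lhs == rhs


# Squared distances between (0,0), (1,0), (0,1), and an arbitrary matrix.
A_SYM = [[0, 1, 1], [1, 0, 2], [1, 2, 0]]
A_ANY = [[2, -1, 0], [3, 1, 4], [0, 5, -2]]
T_VALUES = [Fraction(1, 3), 1, Fraction(5, 2), -2]

checks = [corollary6_holds(a, t) for a in (A_SYM, A_ANY) for t in T_VALUES]
threshold = -cm(A_SYM) / det(A_SYM)  # the ratio appearing in Theorem 5
```

For `A_SYM` the threshold evaluates to 1, which agrees with the circumradius interpretation given in Remark 1 (the right triangle with legs 1 has circumradius $\sqrt{2}/2$).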

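To prove Lemma 10 we use the following well-known formula for computing the determinant
of a 2 × 2 block matrix:

$$\det \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \det(A)\,\det(D - CA^{-1}B).$$

This applies so long as A is invertible. The formula follows from observing that

$$\begin{pmatrix} I & 0 \\ -CA^{-1} & I \end{pmatrix} \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} A & B \\ 0 & -CA^{-1}B + D \end{pmatrix}.$$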

Proof (Proof of Lemma 10) Note that for any n × n matrix A, $t \in \mathbb{R}$, and
$u, v \in \mathbb{R}^n$, we have

$$\det(-uv^T + t\,\mathrm{diag}(u)\,A\,\mathrm{diag}(v)) = \det(\mathrm{diag}(u))\,\det(\mathrm{diag}(v))\,\det(-11^T + tA),$$

where $11^T$ is the rank 1 matrix of all 1's. It thus suffices to show that

$$\det(-11^T + tA) = t^n \det A + t^{n-1}\,\mathrm{cm}(A),$$

where $\mathrm{cm}(A)$ is the Cayley-Menger determinant of A.

Let $w, z \in \mathbb{R}^n$, and let Q be any n × n matrix. Using the above formula, we
have

$$\det \begin{pmatrix} 1 & z^T \\ w & Q \end{pmatrix} = \det(Q - wz^T).$$

On the other hand, the usual cofactor expansion along the first row gives

$$\det \begin{pmatrix} 1 & z^T \\ w & Q \end{pmatrix} = \det(Q) + \det \begin{pmatrix} 0 & z^T \\ w & Q \end{pmatrix}.$$

Therefore,

$$\det(-wz^T + Q) = \det(Q) + \det \begin{pmatrix} 0 & z^T \\ w & Q \end{pmatrix}.$$

In particular, taking $w = z = 1 \in \mathbb{R}^n$ (the column vector of all ones) and
$Q = tA$, we have
$\det(-11^T + tA) = \det(tA) + \mathrm{cm}(tA) = t^n \det A + t^{n-1}\,\mathrm{cm}(A)$. □



Proof of Theorem 5

In addition to the above ingredients, in order to prove Theorem 5 we will also need the
following technical lemma:

Lemma 11 Fix $v \in \mathbb{R}^n_\times$, and let A be an n × n Hebbian matrix. If
$(-1)^n\,\mathrm{cm}(A) \le 0$, then $-vv^T + tA^v$ is not stable for any t > 0. In
particular, if there exists a t > 0 such that $-vv^T + tA^v$ is stable, then
$(-1)^n\,\mathrm{cm}(A) > 0$.

The proof of this lemma uses the following convexity result:

Lemma 12 Let M, N be real symmetric n × n matrices so that M is negative semidefinite
(i.e., all eigenvalues are $\le 0$) and N is strictly negative definite (i.e., stable,
with all eigenvalues < 0). Then $tM + (1 - t)N$ is strictly negative definite (i.e.,
stable) for all $0 \le t < 1$.

Proof M and N satisfy $x^T M x \le 0$ and $x^T N x < 0$ for all nonzero
$x \in \mathbb{R}^n$, so we have $x^T (tM + (1 - t)N) x < 0$ for all nonzero
$x \in \mathbb{R}^n$ if $0 \le t < 1$. □



Proof (Proof of Lemma 11) First, some observations. Since A is symmetric, so are $A^v$
and $-vv^T + tA^v$ for any t. Hence, if any principal submatrix of $-vv^T + tA^v$ is
unstable then $-vv^T + tA^v$ is unstable (see Corollary 4). Therefore, without loss of
generality, we can assume $(-1)^{|\sigma|}\,\mathrm{cm}(A_\sigma) > 0$ for all proper
principal submatrices $A_\sigma$, with $|\sigma| < n$ (otherwise, we use this argument
on a smallest principal submatrix such that
$(-1)^{|\sigma|}\,\mathrm{cm}(A_\sigma) \le 0$). By Lemma 9, this implies that
$A_\sigma$ is a nondegenerate square distance matrix for all $|\sigma| < n$, and so we
also know by Proposition 3 that $(-1)^{|\sigma|} \det A_\sigma < 0$ and that each
$A_\sigma$ has one positive eigenvalue and all other eigenvalues negative, for all
$1 < |\sigma| < n$.

We prove the lemma by contradiction. Suppose there exists a $t_0 > 0$ such that
$-vv^T + t_0 A^v$ is stable. Applying Lemma 12 with $M = -vv^T$ and
$N = -vv^T + t_0 A^v$, we have that $-vv^T + (1 - t)\,t_0 A^v$ is stable for all
$0 \le t < 1$. It follows that $-vv^T + tA^v$ is stable for all $0 < t \le t_0$. Now
Lemma 4 implies that $(-1)^n \det(-vv^T + tA^v) > 0$ for all $0 < t \le t_0$. By
Lemma 10, this is equivalent to having $(-1)^n (t \det A + \mathrm{cm}(A)) > 0$ for all
$0 < t \le t_0$. By assumption, $(-1)^n\,\mathrm{cm}(A) \le 0$, but if
$(-1)^n\,\mathrm{cm}(A) < 0$, then there would exist a small enough t > 0 such that
$(-1)^n (t \det A + \mathrm{cm}(A)) < 0$, so we can conclude that $\mathrm{cm}(A) = 0$
and $(-1)^n \det A > 0$.

Let $\lambda_1 \le \dots \le \lambda_n \le \lambda_{n+1}$ denote the eigenvalues of the
Cayley-Menger matrix

$$CM(A) = \begin{pmatrix} 0 & 1^T \\ 1 & A \end{pmatrix},$$

and observe that A, $A_{[n-1]}$, and $CM(A_{[n-1]})$ are all principal submatrices of
CM(A). Since everything is symmetric, Cauchy's interlacing theorem applies. We have seen
above that $A_{[n-1]}$ has one positive eigenvalue and all others negative, so by
Cauchy's interlacing theorem $\lambda_{n+1} > 0$ and $\lambda_{n-2} < 0$. Because
$\mathrm{cm}(A) = \det CM(A) = 0$, CM(A) must have a zero eigenvalue, while
$\det A \neq 0$ implies (using Cauchy's interlacing theorem) that it is unique. We thus
have two cases.

Case 1: Suppose $\lambda_{n-1} = 0$ and thus $\lambda_n > 0$. Since we assume
$(-1)^{n-1}\,\mathrm{cm}(A_{[n-1]}) > 0$, the n × n matrix $CM(A_{[n-1]})$ must have an
odd number of positive eigenvalues, but by Cauchy's interlacing theorem the top two
eigenvalues must be positive, so we have a contradiction.

Case 2: Suppose $\lambda_n = 0$ and thus $\lambda_{n-1} < 0$. Then by Cauchy's
interlacing theorem, A has exactly one positive eigenvalue. On the other hand, the fact
that $(-1)^n \det A > 0$ implies that A has an even number of positive eigenvalues,
which is a contradiction. □

We can now prove Theorem 5.

Proof (Proof of Theorem 5) We prove (4) ⇒ (3) ⇒ (2) ⇒ (1) ⇒ (4).

(4) ⇒ (3) ⇒ (2) is obvious.

(2) ⇒ (1): Suppose there exists a t > 0 such that $-vv^T + tA^v$ is stable. Then, by
Corollary 4 and Lemma 11, $(-1)^{|\sigma|}\,\mathrm{cm}(A_\sigma) > 0$ for all principal
submatrices $A_\sigma$. By Lemma 9 it follows that A is a nondegenerate square distance
matrix.

(1) ⇒ (4): Suppose A is a nondegenerate square distance matrix. By Lemma 9 we have
$(-1)^{|\sigma|}\,\mathrm{cm}(A_\sigma) > 0$ for all $A_\sigma$, while Proposition 3
implies $(-1)^{|\sigma|} \det(A_\sigma) < 0$ for all $A_\sigma$ with $|\sigma| > 1$.
This implies that for $|\sigma| > 1$ we have
$-\mathrm{cm}(A_\sigma)/\det(A_\sigma) > 0$ (Corollary 5), and that if
$\varepsilon > 0$,

$$(-1)^{|\sigma|}\bigl(\varepsilon \det(A_\sigma) + \mathrm{cm}(A_\sigma)\bigr) > 0 \iff \varepsilon < -\frac{\mathrm{cm}(A_\sigma)}{\det(A_\sigma)}.$$

Applying now Lemma 10,

$$(-1)^{|\sigma|} \det(-vv^T + \varepsilon A^v)_\sigma > 0 \iff \varepsilon < -\frac{\mathrm{cm}(A_\sigma)}{\det(A_\sigma)}.$$

For $|\sigma| = 1$, we have diagonal entries $A_\sigma = A^v_\sigma = 0$ and
$(-vv^T)_\sigma < 0$, so $(-1) \det(-vv^T + \varepsilon A^v)_\sigma > 0$ for all
$\varepsilon$. Using Lemma 4 we conclude (assuming $\varepsilon > 0$):

$$-vv^T + \varepsilon A^v \text{ is stable} \iff \varepsilon < \delta,$$

where

$$\delta = \min_{|\sigma| > 1} \left\{ -\frac{\mathrm{cm}(A_\sigma)}{\det(A_\sigma)} \right\} > 0.$$

It remains only to show that $\delta = -\mathrm{cm}(A)/\det(A)$. Note that we cannot use
Lemma 1 from the Main Text, since this Lemma follows from Proposition 1 and is hence a
consequence of Theorem 5.

Because the matrix $-vv^T + \varepsilon A^v$ changes from stable to unstable at
$\varepsilon = \delta$, by continuity of the eigenvalues as functions of $\varepsilon$
it must be that

$$\det(-vv^T + \delta A^v) = 0.$$

Using Lemma 10 it follows that $\delta \det(A) + \mathrm{cm}(A) = 0$, which implies
$\delta = -\mathrm{cm}(A)/\det(A)$. □



Remarks on the ratio $-\mathrm{cm}(A)/\det(A)$

Remark 1. If A is an n × n nondegenerate square distance matrix for n > 1, then the
ratio $-\mathrm{cm}(A)/\det(A)$ has a very nice geometric interpretation:

$$\left| \frac{\mathrm{cm}(A)}{\det(A)} \right| = -\frac{\mathrm{cm}(A)}{\det(A)} = \frac{1}{2\rho^2},$$

where $\rho$ is the radius of the unique sphere circumscribed on the points used to
generate A. This is proven in [3, Proposition 9.7.3.7], where it is also shown that
$\det(A) \neq 0$ not only if A is a nondegenerate square distance matrix, but also if A
is a degenerate square distance matrix corresponding to n points in general position in
$\mathbb{R}^{n-2}$. Since $\mathrm{cm}(A)$ vanishes in this case, we see that the ratio
$-\mathrm{cm}(A)/\det(A)$ goes smoothly to zero as n points that are initially in
general position in $\mathbb{R}^{n-1}$ approach general position on a hyperplane of
dimension n − 2.
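
As a sanity check of the circumradius interpretation, one can use the classical identity $-\mathrm{cm}(A)/\det(A) = 1/(2\rho^2)$ on a triangle whose circumradius is known in closed form. The sketch below is an assumption-laden illustration (the helper names are hypothetical): it takes the 3-4-5 right triangle, whose circumradius is half the hypotenuse, $\rho = 5/2$.

```python
from fractions import Fraction


def det(m):
    """Exact determinant via Gaussian elimination over Fractions."""
    a = [[Fraction(x) for x in row] for row in m]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return sign * result


def cm(a):
    """Cayley-Menger determinant: det of A bordered by a row/column of ones."""
    n = len(a)
    return det([[0] + [1] * n] + [[1] + list(row) for row in a])


# Squared side lengths of the 3-4-5 right triangle.
A = [[0, 9, 16], [9, 0, 25], [16, 25, 0]]

ratio = -cm(A) / det(A)
rho = Fraction(5, 2)                 # circumradius = half the hypotenuse
expected = 1 / (2 * rho**2)
print(ratio, expected)
```

Both quantities come out to 2/25, so the stability window for this triangle is $0 < \varepsilon < 2/25$.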

Remark 2. The above observations have important implications for the apparent
"fine-tuning" that is involved in eliminating spurious cliques by arranging points to be
collinear, or coplanar, so that the corresponding principal submatrix $A_\sigma$ is
degenerate (as in Figure 2B). Since $-11^T + \varepsilon A_\sigma$ is only stable for

$$0 < \varepsilon < -\frac{\mathrm{cm}(A_\sigma)}{\det(A_\sigma)} = \frac{1}{2\rho^2},$$

where $\rho$ is the radius of the circumscribed sphere, then by making the points
$\{p_i\}_{i \in \sigma}$ corresponding to $A_\sigma$ approximately degenerate, $\rho$
can be made large enough so that $-11^T + \varepsilon A_\sigma$ is unstable, without the
fine-tuning required to make $A_\sigma$ exactly degenerate.

Similarly, exact solutions for k-skeleta of clique complexes, which seem to require S to
be a degenerate square distance matrix, are also not as fine-tuned as they might first
appear. If, in fact, S is a nondegenerate square distance matrix, corresponding to a
configuration of n points in $\mathbb{R}^{n-1}$ that approximately lies on a
k-dimensional plane, the value of $\delta(S_\sigma)$ will be very small for any pattern
of size $|\sigma| > k + 1$; one can thus choose $\varepsilon$ large enough to ensure
that $\mathrm{geom}_\varepsilon(S) = \{\sigma \subset [n] \mid |\sigma| \le k + 1\}$, as
in the case where S is truly degenerate.

Remark 3. It is quite simple to understand the scaling properties of
$-\mathrm{cm}(A)/\det(A)$. If A is any n × n matrix, then
$\mathrm{cm}(tA) = t^{n-1}\,\mathrm{cm}(A)$, while $\det(tA) = t^n \det(A)$, so

$$-\frac{\mathrm{cm}(tA)}{\det(tA)} = \frac{1}{t}\left( -\frac{\mathrm{cm}(A)}{\det(A)} \right),$$

independent of n. If $A_{ij} = \|p_i - p_j\|^2$, for
$p_1, \dots, p_n \in \mathbb{R}^{n-1}$, and we scale the position vectors so that
$p_i \mapsto t p_i$ for each $i \in [n]$, then $A \mapsto t^2 A$ and we have

$$-\frac{\mathrm{cm}(A)}{\det(A)} \mapsto \frac{1}{t^2}\left( -\frac{\mathrm{cm}(A)}{\det(A)} \right).$$

This is consistent with the fact that the radius of the circumscribed sphere, $\rho$,
scales as $\rho \mapsto t\rho$ in this case (see Remark 1).

Remark 4. Consider an n × n matrix A satisfying the Hebbian conditions
$A_{ij} = A_{ji} \ge 0$ and $A_{ii} = 0$. If n is large, it is computationally intensive
to test whether or not A is a nondegenerate square distance matrix using the criteria of
Lemma 9, which potentially require computing $\mathrm{cm}(A_\sigma)$ for all
$\sigma \subset [n]$.

On the other hand, our results imply that in order to test whether or not a Hebbian
matrix A is a nondegenerate square distance matrix it is enough to check the stability
of the matrix

$$-11^T + \varepsilon A, \quad \text{for } \varepsilon = -\frac{1}{2}\,\frac{\mathrm{cm}(A)}{\det(A)}.$$

Here the factor of 1/2 was chosen somewhat arbitrarily, and can be replaced with any
number 0 < c < 1. For large n, this is a computationally efficient strategy, as it
requires checking the eigenvalues of just one matrix.
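
The test in Remark 4 can be sketched as follows. This is a hedged illustration rather than the authors' procedure: to stay dependency-free it decides stability with the leading-principal-minor criterion of Lemma 4 in exact arithmetic instead of calling an eigenvalue routine, it assumes the input is Hebbian with n ≥ 2, and the function names are hypothetical.

```python
from fractions import Fraction


def det(m):
    """Exact determinant via Gaussian elimination over Fractions."""
    a = [[Fraction(x) for x in row] for row in m]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return sign * result


def cm(a):
    """Cayley-Menger determinant: det of A bordered by a row/column of ones."""
    n = len(a)
    return det([[0] + [1] * n] + [[1] + list(row) for row in a])


def is_stable_symmetric(m):
    """Lemma 4(2): (-1)^k det(M_[k]) > 0 for every leading k x k block."""
    n = len(m)
    return all(
        (-1) ** k * det([row[:k] for row in m[:k]]) > 0 for k in range(1, n + 1)
    )


def is_nondegenerate_sdm(a, c=Fraction(1, 2)):
    """Test a Hebbian matrix via stability of -11^T + eps*A, eps = c * ratio."""
    d = det(a)
    if d == 0:
        return False           # nondegenerate matrices have no zero eigenvalue
    ratio = -cm(a) / d
    if ratio <= 0:
        return False           # Corollary 5 requires a positive ratio
    eps = c * ratio
    n = len(a)
    m = [[-1 + eps * a[i][j] for j in range(n)] for i in range(n)]
    return is_stable_symmetric(m)


right_triangle = is_nondegenerate_sdm([[0, 1, 1], [1, 0, 2], [1, 2, 0]])
uniform = is_nondegenerate_sdm([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
broken = is_nondegenerate_sdm([[0, 1, 1], [1, 0, 9], [1, 9, 0]])  # 3 > 1 + 1
```

With floating-point input one would instead compute the eigenvalues of the single matrix $-11^T + \varepsilon A$, as the remark suggests.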

Remark 5. To use truly binary synapses, we can choose S in the Encoding Rule to be the
uniform synaptic strength matrix having $S_{ij} = 1$ for $i \neq j$ and $S_{ii} = 0$ for
all $i \in [n]$. In fact, S is a nondegenerate square distance matrix, and the ratio

$$\delta(S) = \left| \frac{\mathrm{cm}(S)}{\det(S)} \right| = \frac{n}{n-1}$$

turns out to have a very simple form. Similarly, any k × k principal submatrix
$S_\sigma$, with $|\sigma| = k$, satisfies $\delta(S_\sigma) = \dfrac{k}{k-1}$. This
implies that $\mathrm{geom}_\varepsilon(S)$ is the k-skeleton of the complete simplicial
complex on n vertices if $\dfrac{k+2}{k+1} < \varepsilon < \dfrac{k+1}{k}$. By the same
argument as above, for this choice of S and $\varepsilon$ the Encoding Rule yields
$C(W) = X_k(G(V)) = V$, with W an exact solution for V. Note that if we choose
$0 < \varepsilon < 1$, then $\mathrm{geom}_\varepsilon(S) = \mathrm{geom}(S) = 2^{[n]}$,
so the resulting $C(W) \supseteq V$ for any choice of V (c.f. [33]).
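
The closed form for the uniform matrix can be confirmed directly. The sketch below (helper names hypothetical, exact arithmetic via `fractions`) computes $|\mathrm{cm}(S_\sigma)/\det(S_\sigma)|$ for small k and then, for a given $\varepsilon$, finds the largest pattern size m satisfying $\varepsilon < m/(m-1)$.

```python
from fractions import Fraction


def det(m):
    """Exact determinant via Gaussian elimination over Fractions."""
    a = [[Fraction(x) for x in row] for row in m]
    n, sign, result = len(a), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return sign * result


def cm(a):
    """Cayley-Menger determinant: det of A bordered by a row/column of ones."""
    n = len(a)
    return det([[0] + [1] * n] + [[1] + list(row) for row in a])


def uniform(k):
    """k x k matrix with zero diagonal and ones elsewhere."""
    return [[0 if i == j else 1 for j in range(k)] for i in range(k)]


def delta(s):
    """The ratio |cm(S) / det(S)|."""
    return abs(cm(s) / det(s))


ratios = {k: delta(uniform(k)) for k in range(2, 7)}


def max_pattern_size(eps, n):
    """Largest m <= n with eps < m/(m-1); single neurons always qualify."""
    size = 1
    for m in range(2, n + 1):
        if eps < Fraction(m, m - 1):
            size = m
    return size


# eps = 13/10 lies between 5/4 and 4/3, so patterns of size <= 4 survive.
largest = max_pattern_size(Fraction(13, 10), 10)
everything = max_pattern_size(Fraction(1, 2), 10)
```

With $\varepsilon = 13/10$ and k = 3 the bounds $(k+2)/(k+1) = 5/4$ and $(k+1)/k = 4/3$ bracket $\varepsilon$, matching the 3-skeleton (patterns of size at most 4).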



Details related to generation of PF codes for Figure 1

To produce Figure 1, we generated random k-sparse PF codes with circular place fields,
n = 80-100 neurons, and k = 0.1n. For each code, n place field centers were selected
uniformly at random from a square box environment of side length 1, and n place field
radii were drawn independently from an experimentally observed gamma distribution
(Figure 3). We then computed the 2-skeleton for each PF code, with pairwise and triple
overlaps of place fields determined from simple geometric considerations. The full PF
code was obtained as the Helly completion of the 2-skeleton (see Lemma 2). Finally, to
obtain the k-sparse PF code, we restricted the full code to its (k − 1)-skeleton,
thereby eliminating patterns of size larger than k.

Fig. 3 Gamma distribution used for generating random place field radii; this fits the
experimentally-observed mean and variability (see [12, Figure 4B]).

Another exact solution for Figure 2 example

Recall the simplicial complex in Figure 2A. Let S be the symmetric matrix defined by the
following equations for i < j: $S_{ij} = 1$ if i = 1; $S_{24} = S_{35} = 1$;
$S_{23} = S_{26} = S_{36} = 3^2$; and $S_{ij} = 5^2$ if i = 4 or 5. (Here we've assigned
values corresponding to each edge in G(V); remaining entries may be chosen arbitrarily,
as they play no role.) Note that S is not a square distance matrix. Choose
$0 < \varepsilon < \delta(S)$, so that C(W) is given by ^ after applying the Encoding
Rule with V. It is straightforward to check that, among all cliques of X(G(V)), only the
desired patterns are encoded. For example, $\{124\} \in C(W)$ because $S_{\{124\}}$ is a
nondegenerate square distance matrix, as the square roots of the entries satisfy all
triangle inequalities. In contrast, a triangle inequality is violated for each of {123},
{145}, {246}, and {356}.